The largest eigenvalue of a convex function, duality, and a theorem of Slodkowski

Matthew Dellatorre 


Abstract 

First, we provide an exposition of a theorem due to Slodkowski regarding the largest “eigenvalue” of a
convex function. In his work on the Dirichlet problem, Slodkowski introduces a generalized second-order
derivative which for $C^2$ functions corresponds to the largest eigenvalue of the Hessian. The theorem
allows one to extend an a.e. lower bound on this largest “eigenvalue” to a bound holding everywhere.
Via the Dirichlet duality theory of Harvey and Lawson, this result has been key to recent progress on
the fully non-linear, elliptic Dirichlet problem. Second, using the Legendre-Fenchel transform we derive
a dual interpretation of this largest eigenvalue in terms of convexity of the conjugate function. This dual
characterization offers more insight into the nature of this largest eigenvalue and allows for an alternative
proof of a bound needed for the theorem.

1. Introduction


1.1 Motivation 

It is known that a convex function $f$ on $\mathbb{R}^n$ is differentiable almost everywhere and has distributional
second-order partial derivatives. It is also known that a convex function is twice differentiable almost everywhere
in the sense that for a.e. $x \in \mathbb{R}^n$, there exists a symmetric positive semi-definite matrix $D^2 f(x)$ such that

$$f(x+h) = f(x) + \langle \nabla f(x), h\rangle + \tfrac{1}{2}\langle D^2 f(x)h, h\rangle + o(\|h\|^2).$$

The operator $D^2 f$ is called the second-order Peano derivative. Note that its existence does not imply the
existence of $\nabla f$ in a neighbourhood, so it should not be considered the second derivative of $f$ in the usual
sense. This result is due to Alexandrov [1]. See also [5], [6].

In [9], Slodkowski studies uniqueness for a generalized Dirichlet problem in the class of $q$-plurisubharmonic
($q$-psh) functions (for $C^2$ functions on $\mathbb{C}^n$ this is equivalent to the complex Hessian having $n-q$ nonnegative
eigenvalues at every point). The problem of uniqueness reduces to showing that the difference of two such
functions is $(n-1)$-psh, which implies that it satisfies a maximum principle, from which uniqueness then
follows. Functions of this $q$-psh class can be approximated by a subclass of functions which are convex up to a
quadratic polynomial. Because of this it is sufficient to study this smaller class of functions, which, given their
quasi-convexity, retain some of the nice properties of convex functions. In particular, quasi-convex functions are
a.e. twice differentiable, in the above sense. Thus, the second-order behavior of these functions and their
difference is known a.e. However, to show that the difference is a member of the above-mentioned class, it must
satisfy this eigenvalue property everywhere. To this end, Slodkowski introduces a generalized second-order
derivative, which is simply the largest eigenvalue of the Hessian for $C^2$ functions, and proves that if this quantity
is bounded below almost everywhere in some domain, it is bounded below everywhere in that domain. Using
this, he shows that the difference is contained in the desired $(n-1)$-psh class.

Following Slodkowski [9, §3 ], we define the largest “eigenvalue” of a convex function. 

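Definition 1.1. Let $u : \mathbb{R}^n \to \mathbb{R}$. If $\nabla u(x_0)$ exists, $K(u,x_0)$ is defined by the formula

$$K(u,x_0) = \limsup_{\epsilon \to 0} 2\epsilon^{-2} \max\{u(x_0+\epsilon h) - u(x_0) - \epsilon\langle \nabla u(x_0), h\rangle : h \in S^{n-1}\};$$

otherwise $K(u,x_0)$ is defined as $+\infty$.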
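As an illustration of the definition, the following sketch approximates $K(u,x_0)$ in $\mathbb{R}^2$ by evaluating the bracketed expression at a single small $\epsilon$ and maximizing over sampled unit directions. The helper name `estimate_K`, the test function, and the sampling parameters are assumptions made for this example only; the limsup is replaced by one fixed $\epsilon$, so this is a numerical heuristic rather than a computation of the limit.

```python
import math

def estimate_K(u, grad_u, x0, eps=1e-3, n_dirs=720):
    """Approximate K(u, x0) in R^2: evaluate the bracket of Definition 1.1
    at one small eps and maximize over sampled unit directions h."""
    gx, gy = grad_u(x0)
    best = -math.inf
    for i in range(n_dirs):
        theta = 2.0 * math.pi * i / n_dirs
        hx, hy = math.cos(theta), math.sin(theta)
        increment = (u((x0[0] + eps * hx, x0[1] + eps * hy))
                     - u(x0) - eps * (gx * hx + gy * hy))
        best = max(best, increment)
    return 2.0 * best / eps ** 2

# Hypothetical test function: u(x, y) = (a x^2 + b y^2) / 2, with Hessian diag(a, b).
a, b = 3.0, 1.0
u = lambda p: 0.5 * (a * p[0] ** 2 + b * p[1] ** 2)
grad_u = lambda p: (a * p[0], b * p[1])

K_est = estimate_K(u, grad_u, (0.4, -0.2))   # should be close to max(a, b)
```

For this quadratic the bracket equals $\epsilon^2(a h_1^2 + b h_2^2)/2$ exactly, so the estimate agrees with the largest Hessian eigenvalue $\max(a,b)$ up to rounding.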

This is the generalized second-order derivative that Slodkowski defines. For the sake of context, note
that this quantity is a modification of the second-order upper Peano derivative of $u$ in the direction of $h$,
which is defined as

$$\limsup_{\epsilon \to 0^+} 2\epsilon^{-2}\big(u(x_0+\epsilon h) - u(x_0) - \epsilon\langle \nabla u(x_0), h\rangle\big).$$

Being maximal, this second-order derivative is of particular interest because it corresponds to the largest
eigenvalue of the Hessian when the Hessian exists (which it does, in the above sense, almost everywhere for convex
functions), and gives a useful quantity to work with otherwise, especially in the context of Slodkowski’s $C^{1,1}$
estimates.

Regarding this quantity K(u,x), Slodkowski shows the following. 

Theorem 1.2. ([9, Cor. 3.5]) Let $u : \mathbb{R}^n \to \mathbb{R}$ be a locally convex function in $U \subset \mathbb{R}^n$, such that $K(u,x) \geq M$
for almost every $x \in U$. Then $K(u,x) \geq M$ for all $x \in U$.

As mentioned above, the recent work of Harvey and Lawson on the Dirichlet problem was one of our
motivations for studying this quantity $K(u,x)$ and Slodkowski’s proof of the above result. In [4] they study
fully non-linear degenerate elliptic equations of the form

$$F(\mathrm{Hess}(u)) = 0 \ \text{ on } \Omega \tag{1}$$

$$u = \varphi \ \text{ on } \partial\Omega. \tag{2}$$

Given certain convexity assumptions on the boundary, they establish the existence and uniqueness of
continuous solutions using their new Dirichlet duality theory. The work of Slodkowski was “an inspiration” for
that paper, and in particular Theorem 1.2 is the “deepest ingredient” of their proof of uniqueness of viscosity
solutions of (1) [4, p. 398]. These existence and uniqueness results apply to many important problems
including all branches of the homogeneous Monge-Ampère equation, all branches of the special Lagrangian
potential equation, and equations appearing naturally in Lagrangian and calibrated geometry.

Given the usefulness of this generalized derivative and the above result to recent progress on important 
problems, it makes sense to better understand both the derivative and the proof of the theorem. The proof 
is fairly difficult and very geometric so here an illustrated exposition is provided. The quantity K(u,x) 
is then studied further for convex u. In particular, the Legendre-Fenchel transform is applied to give a 
simple alternative characterization of K(u,x) in terms of the convexity of the dual function u* to u. This 
allows for an alternative proof to a key proposition needed to prove Slodkowski’s theorem. Altogether, there 
are now three ways to view this generalized derivative K(u,x): analytically (Definition 1.1), geometrically 
(Proposition 1.6), and dually (Theorem 1.9). 


1.2 Summary 


Theorem 1.2 follows immediately from the following theorem, the proof of which is the main focus of the 
first part of this paper. 


Theorem 1.3. ([9, Thm. 3.2]) Let $u$ be convex near $x_0 \in \mathbb{R}^n$. Assume that $K(u,x_0) = k_0$ is finite. Then
for every $k > k_0$ the set $\{x : K(u,x) < k\}$ is Borel and its lower density at $x_0$ is not less
than $\left(\frac{k-k_0}{2k}\right)^n$.


Lower density is defined as follows. 

Definition 1.4. The lower density of a Lebesgue measurable set $Z \subset \mathbb{R}^n$ at $x_0 \in \mathbb{R}^n$ is the number

$$\liminf_{\epsilon \to 0} \frac{m_n(Z \cap B(x_0,\epsilon))}{m_n(B(x_0,\epsilon))},$$

where $m_n$ denotes the $n$-dimensional Lebesgue measure.
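To make the role of the liminf concrete, here is a small one-dimensional sketch. The set $Z$ below (a union of dyadic intervals accumulating at 0) is a hypothetical example chosen for illustration and does not come from Slodkowski's paper. The density ratio at 0 oscillates: it equals 1/3 at the right endpoints of the intervals and 1/6 at the left endpoints, so the lower density of this $Z$ at 0 is 1/6 while the corresponding limsup is 1/3.

```python
def measure_Z(eps, terms=200):
    """Lebesgue measure of Z ∩ (-eps, eps), where Z is the union of the
    intervals [2^-(2k+1), 2^-(2k)] for k = 0, 1, 2, ..."""
    total = 0.0
    for k in range(terms):
        lo, hi = 2.0 ** -(2 * k + 1), 2.0 ** -(2 * k)
        total += max(0.0, min(hi, eps) - lo)
    return total

def density_ratio(eps):
    # m_1(Z ∩ B(0, eps)) / m_1(B(0, eps)), with B(0, eps) = (-eps, eps)
    return measure_Z(eps) / (2.0 * eps)

# Ratios along two sequences of radii tending to 0.
ratios_at_right_ends = [density_ratio(2.0 ** -(2 * k)) for k in range(1, 8)]
ratios_at_left_ends = [density_ratio(2.0 ** -(2 * k + 1)) for k in range(1, 8)]
```

Between these radii the ratio is monotone (increasing across each interval, decreasing across each gap), which is why the two sequences capture the extreme values.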


Slodkowski’s proof of Theorem 1.3 divides naturally into two parts. First, an equivalent geometric 
characterization of a bound on K(u,x) is given in terms of spheres tangent to the graph of u. This is the 
content of the following definition and proposition. 

For $c = (c_1,\dots,c_{n+1}) \in \mathbb{R}^{n+1}$, let $S(c,r)$ denote the $n$-sphere with center $c$ and radius $r$, and $B(c,r)$
denote the open $(n+1)$-disk of radius $r$ centered at $c$.

Definition 1.5. The sphere $S(c,r)$ is a sphere of support from above at $y = (x_0,u(x_0))$ if $y \in S(c,r)$,
$B(c,r) \cap \mathrm{graph}(u) = \emptyset$ and $c_{n+1} > u(P(c))$, where $P$ denotes the orthogonal projection of $\mathbb{R}^{n+1}$ onto $\mathbb{R}^n$.

Thus, $S(c,r)$ can be visualized as a ball resting on a “surface” that is the graph of $u$, and such that
$(x_0,u(x_0))$ is one of its resting points.


Proposition 1.6. ([9, Prop. 3.3]) Let $U \subset \mathbb{R}^n$ be open and $u : U \to \mathbb{R}$ be convex. Assume that $u$ has a
gradient at $x$.

(i) If $u$ has second-order Peano derivatives at $x$, then $K(u,x)$ is equal to the norm (i.e. the largest
eigenvalue) of the real Hessian of $u$ at $x$.

(ii) If $K(u,x)$ is finite, then for every $K > K(u,x)$ there is $\epsilon > 0$ such that
$u(x+h) - u(x) - \langle \nabla u(x), h\rangle \leq \frac{1}{2}K|h|^2$ for $|h| < \epsilon$.

(iii) If there is a sphere $S(c,r)$, $r > 0$, which supports the graph of $u$ from above at $(x,u(x))$, then

$$K(u,x) \leq \frac{(1+|\nabla u(x)|^2)^{3/2}}{r}. \tag{3}$$

Parts (ii) and (iii) give the above-mentioned equivalence between a bound on $K(u,x)$ and a sphere of support
to the graph of a corresponding radius at $(x,u(x))$. See section 2.2 for a more detailed explanation.

The second part of the proof then uses this alternative characterization of K(u,x) to obtain a density 
result, which is essentially the statement of the theorem in terms of spheres of support as opposed to K(u, x). 
This is the content of the following lemma. 

Lemma 1.7. ([9, Lemma 3.4]) Let $u$ be a non-negative convex function in $B(0,d) \subset \mathbb{R}^n$, $d > 0$, such that
$u(0) = 0$ and $\nabla u(0) = 0$. Let $R > 0$ and assume that the closed ball $\overline{B}(c,R)$, $c = (0,\dots,0,R) \in \mathbb{R}^{n+1}$,
intersects the graph of $u$ only at $0 \in \mathbb{R}^{n+1}$. Let $X_r$, $0 < r < R$, denote the set of all $x \in B(0,d) \subset \mathbb{R}^n$
such that there exists a sphere of radius $r$ supporting the graph of $u$ from above at $(x,u(x))$. Then the lower
density of $X_r$ at 0 is not less than $\left(\frac{R-r}{2R}\right)^n$.

As will be seen in more detail in section 2, there is an inverse relationship between the bound on $K(u,x)$ and
the radius of the sphere of support to the graph of $u$ at $(x,u(x))$. This explains the similarity between
the lower bound on density given in the lemma and the one in the theorem.
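The inverse relationship can be checked with a short computation: substituting radii $R = 1/k_0$ and $r = 1/k$ into the bound $((R-r)/2R)^n$ of Lemma 1.7 gives $((k-k_0)/2k)^n$. The function names and the numerical values below are illustrative assumptions, and the sketch takes $k > k_0 > 0$.

```python
def sphere_density_bound(R, r, n):
    """Lower bound ((R - r) / (2R))^n from Lemma 1.7, for 0 < r < R."""
    return ((R - r) / (2.0 * R)) ** n

def eigenvalue_density_bound(k0, k, n):
    """The same bound after substituting R = 1/k0 and r = 1/k, with k > k0 > 0."""
    return ((k - k0) / (2.0 * k)) ** n

k0, k, n = 2.0, 5.0, 3          # illustrative values only
lhs = sphere_density_bound(1.0 / k0, 1.0 / k, n)
rhs = eigenvalue_density_bound(k0, k, n)
```

Both expressions evaluate to the same number, since $(1/k_0 - 1/k)/(2/k_0) = (k - k_0)/(2k)$.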

The geometric characterization of $K(u,x)$ is key to proving Theorem 1.3 and helpful in understanding
what quality this generalized derivative captures about the function $u$ and its graph. Since the results here
concern functions that are at least locally convex, it is natural to study them via the Legendre-Fenchel
transform, the classical transform of convex analysis. By definition, the set of points above the graph of
a convex function (the epigraph) is a convex set. Any closed convex set in $\mathbb{R}^n$ can be defined entirely by a family
of supporting hyperplanes. Thus, since the epigraph of $u$ completely determines the graph of $u$, which in
turn completely determines $u$, this family of hyperplanes can be considered an alternative description or
parametrization of $u$. This is essentially how the transform of $u$ (or dual function to $u$), $u^*$, is defined. Each
point $p \in \mathbb{R}^n$ defines a collection of hyperplanes (via gradient), and $u^*$ specifies a point $u^*(p) \in \mathbb{R}$, such that
$(0,\dots,0,-u^*(p)) \in \mathbb{R}^{n+1}$ lies on the one hyperplane of this collection which supports the epigraph (or graph)
of $u$.

Interestingly, under the Legendre-Fenchel transform, differentiability properties of $u$ correspond to
convexity properties of $u^*$. Two classic examples of this are the following.

Proposition 1.8. Let $f : \mathbb{R}^n \to \mathbb{R}$. Then

(i) $f$ is strictly convex if and only if $f^*$ is differentiable.

(ii) $f$ is strongly convex with modulus $c$ if and only if $f^*$ is differentiable and $\nabla f^*$ is Lipschitz continuous
with constant $\frac{1}{c}$.
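A one-dimensional sketch of this correspondence, under the assumption that “strongly convex with modulus $c$” means that $f(x) - \frac{c}{2}|x|^2$ is convex: for $f(x) = \frac{c}{2}x^2$ the conjugate is $f^*(y) = \frac{y^2}{2c}$, whose derivative $y/c$ is Lipschitz with constant $1/c$. The code approximates the supremum in the Legendre-Fenchel transform on a grid; the helper `conjugate`, the grid, and the sample points are choices made for this example.

```python
def conjugate(f, y, grid):
    """Discrete Legendre-Fenchel transform: sup over the grid of x*y - f(x)."""
    return max(x * y - f(x) for x in grid)

c = 4.0                                    # strong convexity modulus (illustrative)
f = lambda x: 0.5 * c * x * x              # f(x) = (c/2) x^2
grid = [i / 1000.0 for i in range(-5000, 5001)]   # x in [-5, 5], step 1e-3

ys = [-2.0, -0.5, 0.0, 1.0, 3.0]
numeric = [conjugate(f, y, grid) for y in ys]
exact = [y * y / (2.0 * c) for y in ys]    # f*(y) = y^2 / (2c), derivative y / c
```

The grid must contain the maximizer $x = y/c$ (or a point near it) for the approximation to be accurate, which is why the sample points `ys` are kept well inside the range covered by `grid`.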

Given that $K(u,x)$ is a (local) differentiability property of $u$, it seems there should be an appropriate (local)
convexity property corresponding to $u^*$. In section 3 we prove the following result.

Theorem 1.9. Let $f : \mathbb{R}^n \to \mathbb{R}$ be convex. If $K(f,x_0) = k_0 < k$ then $f^*$ is quadratically convex at $y_0 =
\nabla f(x_0)$ with modulus $\frac{1}{k}$. Conversely, if $f^*$ is quadratically convex at $y_0$ with modulus $\frac{1}{k}$ then $K(f,x_0) = k_0 < k$.

Quadratically convex at $y_0$, which is defined in section 3, is a more local form of convexity than the two types
of convexity referred to in Proposition 1.8. This dual characterization of $K(u,x)$ allows for an alternative
proof of Proposition 1.6. Using quadratics to define different types of convexity is standard (e.g.
quasi-convexity, strong convexity). See section 3 for definitions of all these terms and a more detailed discussion.

In Slodkowski’s proof quadratics arise naturally via the definition of $K(u,x)$, and from this, spheres. The
geometric properties of spheres make certain arguments very clear (see the proof of Lemma 1.7); however, some
manipulations and calculations are simpler with quadratics, given their constant second-order behavior. For
example, in [5] Harvey and Lawson provide an alternative proof of Slodkowski’s lemma (as well as Alexandrov’s
theorem stated above) via a generalization that uses quadratics instead of spheres. Their proof is modelled
on Slodkowski’s, and they obtain their result for the larger class of quasi-convex functions. Instead of
spheres of support, they use the notion of upper contact jets, where given $p \in \mathbb{R}^n$ and $A$ a real symmetric
$n \times n$ matrix, $(p,A)$ is an upper contact jet for $u$ at $x$ if there exists a neighbourhood of $x$ on which

$$u(y) \leq u(x) + \langle p, y-x\rangle + \tfrac{1}{2}\langle A(y-x), y-x\rangle.$$

Slodkowski’s result then corresponds to $A = \lambda I$.

1.4 Organization

Section 2 contains the exposition of Slodkowski’s proof of Theorem 1.3: §2.1 gives an overview of the proof,
§2.2 a slight variation of Slodkowski’s proof of Proposition 1.6 (the generalized $C^{1,1}$ estimate), §2.3 an
expanded and illustrated version of Slodkowski’s proof of Lemma 1.7, and §2.4 combines these for the proof
of the theorem.

Section 3 studies $K(u,x)$ from the dual perspective: §3.1 recalls some basic convex analysis, including
Legendre-Fenchel duality, §3.2 provides an equivalent interpretation of $K(u,x)$ in terms of the dual function
to $u$, and uses this for an alternative proof of the $C^{1,1}$ estimate.

The Appendix considers Lipschitz continuity of the gradient and the geometric interpretation of $K(u,x)$:
§A.1 demonstrates that $K(u,x)$ is bounded by the Lipschitz constant when $u$ is $C^{1,1}$, §A.2 gives an example of
a function with a sphere of support that is not $C^{1,1}$ on any neighbourhood, §A.3 compares $K(u,x)$ to the
classical notion of an osculating circle to a plane curve and gives an extension of this to higher dimensions,
§A.4 relates the radius of a sphere of support to a function to the radius of a supporting sphere to
its dual.


2. Exposition of Slodkowski’s proof 


2.1 Overview 

Theorem 1.3 is concerned with the set of points (near $x_0$) such that $K(u,x) < k$, for some fixed $k > k_0 =
K(u,x_0)$. However, this set may be difficult to study directly given that the only information available about
$u$ is that it is continuous (bounded and convex) on some neighbourhood of $x_0$ and $K(u,x_0) = k_0 < \infty$.
In particular, knowing the value of $K(u,x)$ at a given point does not immediately suggest anything about
its value nearby. Thus, the first step towards a better understanding of this set of points is an alternative
characterization of what it means for $K(u,x)$ to be bounded at some point.

At a point $x$, the condition $K(u,x) < \infty$ is equivalent to the existence of a (local) sphere of support from above
to the graph of $u$ at $(x,u(x))$. This is precisely what Proposition 1.6 (ii) and (iii) state. Part (ii) implies the
existence (locally) of a quadratic function tangent to the graph of $u$ at $(x,u(x))$ which majorizes $u$ on some
neighbourhood, and this in turn implies the (local) existence of a sphere of support to the graph of $u$ at $(x,u(x))$.
The content of (iii) is clear.

With this alternative geometric characterization in hand, Lemma 1.7 then proves the theorem in terms
of these spheres of support. To accomplish this another change in perspective is needed, which takes further
advantage of this more geometric interpretation of $K(u,x)$. Instead of looking at points $x$ in the domain of
$u$ such that there exists a sphere of support to the graph of $u$ at $(x,u(x))$, it is better to consider for each
point $x$ in the domain of $u$ an $n$-sphere (of fixed radius) in $\mathbb{R}^{n+1}$ above the graph of $u$ with center $c \in \mathbb{R}^{n+1}$
such that $P(c) = x$, where $P : \mathbb{R}^{n+1} \to \mathbb{R}^n$ is the projection map. If we lower this sphere down towards $x$
it will of course eventually intersect the graph of $u$. Since $u$ is continuous, it is not difficult to show that
on a small enough neighbourhood these spheres will come down on a closed part of the graph of $u$ and thus
there will be an initial point of contact. This sphere is by definition a sphere of support to the graph of $u$
at that point. The next step is to show that for every $\epsilon$-neighbourhood of 0 ($x_0 = 0$ for Lemma 1.7) there
is a corresponding $\delta = \delta(\epsilon)$ such that the spheres above the points in $B(0,\delta)$ are spheres of support to the
graph at points $(x,u(x))$, where $x \in B(0,\epsilon)$. Now $B(0,\delta)$ is a much nicer set to work with than $X_r \cap B(0,\epsilon)$,
and these two sets can be related by a few simple Lipschitz maps. Since Lipschitz maps behave nicely with
respect to measures, this allows us to place a lower bound on the measure $m(X_r \cap B(0,\epsilon))$ for each $\epsilon$.
A limiting argument is then used to obtain the lower bound on the lower density at 0.

Proposition 1.6 and Lemma 1.7 can then be combined to give Theorem 1.3. A sketch of the proof is
as follows. Start with a point $x_0$ where $K(u,x_0) = k_0$ is finite (hypothesis of Theorem 1.3), and choose any
$k > k_0$. Note it can be assumed without loss of generality that $x_0 = 0$, $u(0) = 0$, and $\nabla u(0) = 0$ (see
section 2.3 for details). Fix radii $r$ and $R$ with $1/k < r < R < 1/k_0$ (reading $1/k_0$ as $+\infty$ when $k_0 = 0$).
Then apply Proposition 1.6 (ii), which locally gives a sphere of support of radius $R$ at $(x_0,u(x_0))$. Now, apply
Lemma 1.7 to get a lower bound on the density of $X_r$ at $x_0$. Next, apply Proposition 1.6 (iii) to convert this
into a statement about the density of $X'_k$, where

$$X'_k = \{x \in \mathrm{dom}(u) \mid K(u,x) < k\}.$$

This last step is accomplished by using the continuity of the gradient to show that in a small enough
neighbourhood $X_r \subset X'_k$. More explicitly, $x \in X_r$ implies $K(u,x) \leq r^{-1}(1+|\nabla u(x)|^2)^{3/2}$ and $\nabla u(x_0) = 0$,
so by continuity of the gradient of convex functions and since $k > 1/r$, $\nabla u(x)$ will eventually be small enough
so that $r^{-1}(1+|\nabla u(x)|^2)^{3/2} < k$. Thus, for such $x \in X_r$, $K(u,x) < k$. This gives the theorem by choosing $R$
arbitrarily close to $1/k_0$ and $r$ arbitrarily close to $1/k$ (see section 2.4 for a detailed proof).

2.2 The generalized $C^{1,1}$ estimate

Figure 1: $d : B(c,r) \to \mathbb{R}$


In this subsection we provide an alternative proof of Proposition 1.6 (iii). The main idea is as follows: given
a sphere of support of radius $r$ to the graph of $u$ at the point $(x,u(x))$, the lower hemisphere of this sphere
defines the graph of a smooth convex function that agrees up to first order with $u$ at $x$ and majorizes $u$
elsewhere. Denote this function by $d$. It immediately follows that $K(u,x) \leq K(d,x)$, and the rest of the
proof consists in computing $K(d,x)$, which is equal to the largest eigenvalue of the Hessian of $d$ because $d$ is smooth.

Proof of Proposition 1.6 (iii). Assume that the sphere $S((c,t),r)$, $c \in \mathbb{R}^n$, supports the graph of $u$ from
above at $(x_0,u(x_0))$ and that $u$ is differentiable at $x_0$. Define $d : B(c,r) \to \mathbb{R}$ to be the function whose graph
is the lower open hemisphere of $S((c,t),r)$. Recall the definition of $K(u,x_0)$:

$$K(u,x_0) := \limsup_{\epsilon \to 0} 2\epsilon^{-2} \max\{u(x_0+\epsilon h) - u(x_0) - \epsilon\langle\nabla u(x_0),h\rangle : |h| = 1\}.$$

Clearly, since $d(x_0) = u(x_0)$, $\nabla d(x_0) = \nabla u(x_0)$, and $d$ majorizes $u$,

$$K(u,x_0) \leq K(d,x_0).$$

Since $d$ is smooth, Taylor's theorem gives, for some $0 < \gamma_{\epsilon,h} < 1$,

$$\begin{aligned}
K(d,x_0) &= \limsup_{\epsilon\to 0} 2\epsilon^{-2}\max\{d(x_0+\epsilon h) - d(x_0) - \epsilon\langle\nabla d(x_0),h\rangle : |h| = 1\} \\
&= \limsup_{\epsilon\to 0} 2\epsilon^{-2}\max\{\tfrac{1}{2}\langle\nabla^2 d(x_0+\gamma_{\epsilon,h}\epsilon h)\epsilon h, \epsilon h\rangle : |h| = 1\} \\
&= \limsup_{\epsilon\to 0} \max\{\langle\nabla^2 d(x_0+\gamma_{\epsilon,h}\epsilon h)h, h\rangle : |h| = 1\} \\
&= \max\{\langle\nabla^2 d(x_0)h, h\rangle : |h| = 1\} \quad \text{by continuity and compactness} \\
&= \lambda_{\max}, \ \text{the maximum eigenvalue of } \nabla^2 d(x_0).
\end{aligned}$$

Thus, it remains to show that

$$\lambda_{\max} = \frac{(1+|\nabla u(x_0)|^2)^{3/2}}{r}.$$

The function $d$, whose graph is the lower hemisphere of the sphere of radius $r$ centered at $(c,t)$, where
$c \in \mathbb{R}^n$ and $t \in \mathbb{R}$, is given by

$$d(x) = t - \sqrt{r^2 - |c-x|^2}.$$

Without loss of generality we may assume that the sphere of support is centered at the origin and $x_0$ has
only its first component non-zero, as otherwise we could always shift and then rotate without affecting the
second-order behavior. In other words, assume $(c,t) = 0 \in \mathbb{R}^{n+1}$ and $x_0 = (s_1,\dots,s_n) = (s,0,\dots,0) \in \mathbb{R}^n$.
Then $d(x) = -\sqrt{r^2 - |x|^2}$.

Let

$$w(x) := \frac{1}{r^2-s^2}\big(|x-x_0|^2 + 2\langle x-x_0, x_0\rangle\big).$$

Since

$$|x|^2 = \langle x,x\rangle = \langle (x-x_0)+x_0, (x-x_0)+x_0\rangle = |x-x_0|^2 + 2\langle x_0, x-x_0\rangle + |x_0|^2$$

and $|x_0|^2 = s^2$, we can write $d(x)$ as

$$d(x) = -\sqrt{r^2-s^2}\,\sqrt{1-w(x)}.$$

Now expanding $\sqrt{1-w(x)}$ as a series and dropping the terms of order higher than two (as they will have 0
Hessian at $x_0$),

$$d(x) \approx -\sqrt{r^2-s^2}\left(1 - \frac{w(x)}{2} - \frac{w(x)^2}{8}\right).$$

This can be further reduced to

$$d(x) \approx -\sqrt{r^2-s^2}\left(1 - \frac{w(x)}{2} - \frac{1}{8}\left(\frac{2\langle x-x_0,x_0\rangle}{r^2-s^2}\right)^2\right),$$

since we are only concerned with the expression for $d$ modulo powers higher than two.

Thus, $d(x)$ has been replaced by a diagonal quadratic form and straightforward computations give

$$\nabla d(x_0) = \frac{x_0}{\sqrt{r^2-s^2}}$$

and

$$\nabla^2 d(x_0) = \frac{1}{\sqrt{r^2-s^2}}\,I + \frac{s}{(r^2-s^2)^{3/2}}\,A,$$

where $I$ is the $n \times n$ identity matrix and $A$ is the $n \times n$ matrix with first row $x_0 = (s,0,\dots,0)$ and zeros
elsewhere. Since

$$\frac{s^2}{(r^2-s^2)^{3/2}} \geq 0,$$

it follows immediately that

$$\lambda_{\max} = \frac{1}{\sqrt{r^2-s^2}} + \frac{s^2}{(r^2-s^2)^{3/2}} = \frac{r^2}{(r^2-s^2)^{3/2}}.$$


Furthermore, the vector $(x_0,u(x_0))$ is of length $r$ and proportional to the upward pointing unit normal to the
graph of $u$ at $(x_0,u(x_0))$, which is equal to

$$\frac{1}{\sqrt{1+|\nabla u(x_0)|^2}}\,(-\nabla u(x_0), 1).$$

Since the center of the sphere (the origin) lies above the graph, $(x_0,u(x_0))$ is $-r$ times this normal. Scaling
by $-r$, we obtain

$$x_0 = \frac{r\nabla u(x_0)}{\sqrt{1+|\nabla u(x_0)|^2}},$$

giving

$$s^2 = |x_0|^2 = \frac{r^2|\nabla u(x_0)|^2}{1+|\nabla u(x_0)|^2}.$$

Therefore, since $r^2 - s^2 = r^2/(1+|\nabla u(x_0)|^2)$,

$$\lambda_{\max} = \frac{(1+|\nabla u(x_0)|^2)^{3/2}}{r}.$$

□
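The identity just derived can be sanity-checked numerically. The sketch below works in $\mathbb{R}^2$ with the sphere centered at the origin, uses the closed-form Hessian $\nabla^2 d(x) = I/\rho + xx^T/\rho^3$ with $\rho = \sqrt{r^2-|x|^2}$, and compares its largest eigenvalue with $(1+|\nabla d(x)|^2)^{3/2}/r$. The function names and the sample point are illustrative assumptions, not part of the proof.

```python
import math

def hemisphere_hessian_2d(x, r):
    """Hessian of d(x) = -sqrt(r^2 - |x|^2) at x in R^2 (|x| < r):
    I / rho + x x^T / rho^3 with rho = sqrt(r^2 - |x|^2)."""
    rho = math.sqrt(r * r - x[0] ** 2 - x[1] ** 2)
    h11 = 1.0 / rho + x[0] * x[0] / rho ** 3
    h12 = x[0] * x[1] / rho ** 3
    h22 = 1.0 / rho + x[1] * x[1] / rho ** 3
    return h11, h12, h22

def largest_eigenvalue_2x2(h11, h12, h22):
    # closed form for a symmetric 2x2 matrix
    mean = 0.5 * (h11 + h22)
    return mean + math.sqrt((0.5 * (h11 - h22)) ** 2 + h12 ** 2)

def curvature_formula(x, r):
    """(1 + |grad d(x)|^2)^(3/2) / r, using grad d(x) = x / rho."""
    rho = math.sqrt(r * r - x[0] ** 2 - x[1] ** 2)
    grad_sq = (x[0] ** 2 + x[1] ** 2) / rho ** 2
    return (1.0 + grad_sq) ** 1.5 / r

r, x = 2.0, (0.7, -1.1)        # illustrative radius and contact point, |x| < r
lam_max = largest_eigenvalue_2x2(*hemisphere_hessian_2d(x, r))
predicted = curvature_formula(x, r)
```

The point `x` is deliberately not aligned with a coordinate axis, so the check does not rely on the rotation used in the proof.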

We state explicitly the following interesting result on “lower hemisphere functions”, i.e. functions on a
disc $D \subset \mathbb{R}^n$ defined by the lower hemisphere of an $n$-sphere in $\mathbb{R}^{n+1}$. The proof follows immediately from
the above proof, by looking at the expression for the Hessian.

Proposition 2.1. Let $d : D \to \mathbb{R}$ be a lower hemisphere function defined on a disc $D \subset \mathbb{R}^n$ and $x \in D$. If
$\nabla d(x) \neq 0$, then $\nabla d(x)$ is an eigenvector of $\nabla^2 d(x)$ corresponding to the largest eigenvalue.

Proof. Without loss of generality we may assume that the lower hemisphere and thus $D$ are centered at the
origin and $x$ has only its first coordinate non-zero, $x = (s,0,\dots,0)$. Then, as shown above, the Hessian of $d$ at
$x$ is a diagonal $n \times n$ matrix of the form

$$\nabla^2 d(x) = \mathrm{diag}\left(\frac{r^2}{(r^2-s^2)^{3/2}}, \frac{1}{\sqrt{r^2-s^2}}, \dots, \frac{1}{\sqrt{r^2-s^2}}\right).$$

Thus, $(1,0,\dots,0)$ is an eigenvector corresponding to the largest eigenvalue. As calculated above,

$$\nabla d(x) = \frac{s}{\sqrt{r^2-s^2}}\,(1,0,\dots,0),$$

so clearly $\nabla d(x)$ is also an eigenvector corresponding to the largest eigenvalue.

□
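A quick numerical check of Proposition 2.1 in $\mathbb{R}^3$, without rotating coordinates: for $d(x) = -\sqrt{r^2-|x|^2}$ the sketch forms the gradient $x/\rho$ and the Hessian $I/\rho + xx^T/\rho^3$ (with $\rho = \sqrt{r^2-|x|^2}$) and verifies that the Hessian maps the gradient to $r^2/\rho^3$ times itself. The helper names and the sample point are assumptions made for illustration.

```python
import math

def hemisphere_gradient(x, r):
    rho = math.sqrt(r * r - sum(t * t for t in x))
    return [t / rho for t in x]

def hemisphere_hessian(x, r):
    rho = math.sqrt(r * r - sum(t * t for t in x))
    n = len(x)
    return [[(1.0 / rho if i == j else 0.0) + x[i] * x[j] / rho ** 3
             for j in range(n)] for i in range(n)]

def mat_vec(M, v):
    return [sum(M[i][j] * v[j] for j in range(len(v))) for i in range(len(v))]

r, x = 3.0, [1.0, -0.5, 2.0]             # illustrative values with |x| < r
g = hemisphere_gradient(x, r)
Hg = mat_vec(hemisphere_hessian(x, r), g)
rho = math.sqrt(r * r - sum(t * t for t in x))
lam_max = r * r / rho ** 3               # largest eigenvalue r^2 / (r^2 - |x|^2)^(3/2)
residual = max(abs(Hg[i] - lam_max * g[i]) for i in range(3))
```

The remaining eigenvalues all equal $1/\rho$, which is smaller than $r^2/\rho^3$ because $\rho < r$.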


2.3 The density lemma 

If at the point $x_0 = 0$ there is a sphere of support of radius $R$, Lemma 1.7 provides a lower bound on the
lower density of the set $X_r$ of points with a sphere of support of radius $r < R$. Note that without loss of
generality it may be assumed that $x_0 = 0$, $u(0) = 0$, and $\nabla u(0) = 0$, since any convex function $u$ can always
be adjusted by a constant and a linear term so that this is true without affecting the second-order behaviour of
$u$.

As mentioned in section 2.1, Lemma 1.7 is proved by looking not directly at $X_r$ but at small neighbourhoods
of 0 that are the projection of the set of centers of spheres of support to the graph of $u$ on shrinking
neighbourhoods. For each $\epsilon > 0$ a $\delta = \delta(\epsilon)$ is needed so that $B(0,\delta)$ is contained in the projection onto $\mathbb{R}^n$
of the set of centers of spheres of support to the graph of $u$ restricted to an $\epsilon$-neighbourhood. Since the
only information about $u$ is that there is a sphere of support at 0, this is what is used to construct $\epsilon$ and $\delta$.
More specifically, the appropriate $\epsilon$'s and $\delta$'s are found by constructing a family of convex functions that are
identical to $u$ on a neighbourhood of 0, but greater and simpler outside this neighbourhood. This allows one
to fully utilize the only initial information given. Using this family of simple functions and basic geometry,
three key set inclusions are obtained, which essentially relate $B(0,\delta(\epsilon))$ to $X_r \cap B(0,\epsilon)$. Then, using Lipschitz
maps to relate these sets and applying properties of Lipschitz functions on measure, the lower density
bound is shown. This whole construction is crucial because it provides a much simpler approach to studying
the possibly very complex set $X_r$. The following is the proof given by Slodkowski.

Proof of Lemma 1.7.

The number $r \in (0,R)$ will be kept fixed, so let $X = X_r$. Define

$$Z = \{(x,u(x)) \in \mathbb{R}^{n+1} : x \in X\}.$$

It is clear that $Z \cap (\overline{B}(0,d') \times \mathbb{R})$ is compact for every $d' < d$; thus $X \cap \overline{B}(0,d')$ is also compact, as it
is the image of this set under the orthogonal projection $P : \mathbb{R}^{n+1} \to \mathbb{R}^n$. Since compact sets are Lebesgue
measurable, the notion of lower density is applicable to both $X$ and $Z$.

It is more convenient to first estimate the density of $Z$ at 0 with respect to Hausdorff measure, and then
use the properties of Lipschitz functions on measure to obtain bounds on the density of $X$. To accomplish
this, a family of convex functions, built from the initial sphere of support of radius $R$ at 0, which modify $u$
outside a small neighbourhood of 0 will be constructed. As mentioned above, these functions will be identical
to $u$ on a neighbourhood of 0 and very simple outside this neighbourhood. These functions will enable us
to find a corresponding $\delta = \delta(\epsilon)$ neighbourhood for each $\epsilon$ so that $x \in B(0,\delta)$ implies that $x = P(c)$, where
$c \in \mathbb{R}^{n+1}$ is the center of a sphere of support to the graph at $(x',u(x'))$, for some $x' \in B(0,\epsilon) \cap X_r$.

Step One. A family of convex functions is constructed which will let us find an appropriate $\delta(\epsilon)$, as explained
above. For each $\alpha$ such that $0 < \alpha < \frac{1}{2}\arcsin(\frac{d}{R})$, define the function

$$v_\alpha : B(0,R) \to [0,\infty)$$

as follows. First, define

$$Y = \{y \in \mathbb{R}^{n+1} : |y-c| = R,\ \angle(y-c, 0-c) = 2\alpha\}, \tag{4}$$

where $c = (0,\dots,0,R) \in \mathbb{R}^{n+1}$ is the center of the sphere of support to $u$ at $(0,u(0))$. $Y$ forms a “ring” on
$S(c,R)$, and clearly the projection of $Y$, $P(Y)$, onto $\mathbb{R}^n$ is the $(n-1)$-sphere of radius $R\sin 2\alpha$, centered at 0.
Next, let $C_\alpha$ denote the union of all closed segments $\overline{wy}$ with one endpoint $w$ on the axis $0 \times \mathbb{R} \subset \mathbb{R}^{n+1}$ and
tangent to the sphere $S(c,R)$ at the other endpoint $y$, where $y \in Y$. Note that $w$ is independent of the choice of
$y \in Y$. $C_\alpha$ is simply a finite cone with vertex $w$ and base $Y$, tangent to $S(c,R)$ along $Y$.
See Figure 2.

Define now

$$T_\alpha = \{y \in S(c,R) : R(1-\cos 2\alpha) \leq y_{n+1} \leq R\}. \tag{5}$$

$T_\alpha$ can be visualized as a “strip” of $S(c,R)$; note that $T_\alpha \cap C_\alpha = Y$ and that $T_\alpha \cup C_\alpha$ defines a convex
function $k_\alpha : B(0,R) \to \mathbb{R}$.

For $0 < \alpha < \frac{1}{2}\arcsin(\frac{d}{R})$, define

$$v_\alpha(x) = \begin{cases} \max(u(x), k_\alpha(x)), & |x| \leq R\sin 2\alpha, \\ k_\alpha(x), & R\sin 2\alpha < |x| < R. \end{cases}$$

Note that $u$ is only defined on $B(0,d)$ and $R\sin 2\alpha < d < R$, which is why $v_\alpha$ is defined this way. It is clear
that

$$v_\alpha(x) \geq u(x), \quad \text{for } |x| < d. \tag{6}$$

Observe that $v_\alpha$ is locally convex on the set $|x| \neq R\sin 2\alpha$, since for $|x| > R\sin 2\alpha$, $v_\alpha = k_\alpha$, which is
convex, and for $|x| < R\sin 2\alpha$, $v_\alpha$ is the maximum of two convex functions, which is convex. If $|x| = R\sin 2\alpha$,
then $(x,v_\alpha(x)) \in Y \subset S(c,R)$. Since $S(c,R)$ lies above the graph of $u$, $k_\alpha|_Y > u|_Y$. Thus near $Y$, $v_\alpha = k_\alpha$,
and so $v_\alpha$ is locally convex in $B(0,R)$, which implies that $v_\alpha$ is convex.

Step Two. For any convex function the following Lipschitz map can be constructed. This will let us
relate the possibly complex set $X$ to the disk $B(0,\delta(\epsilon))$. Let $v : B(0,R) \to \mathbb{R}$ be a convex function. Let
$E(v) = \{(x,t) \in \mathbb{R}^{n+1} : t > v(x)\}$ denote the strict epigraph of $v$, and define $Z_v$ as the set of all $y = (x,v(x))$,
where $|x| < R$, such that for some $c' \in \mathbb{R}^{n+1}$, $B(c',r) \subset E(v)$ and $y \in S(c',r)$, where $r < R$, as defined
earlier.

Note that if $y = (x,v(x)) \in Z_v$, then $\mathrm{graph}(v)$ has a unique supporting hyperplane at $y$ (since any
such hyperplane is tangent to $S(c',r)$), and thus $c'$ is uniquely determined by $y$.

Now consider the map $\gamma_v : Z_v \to \mathbb{R}^{n+1}$, where $\gamma_v(y) = c'$. This map is Lipschitz with constant one. To
see this, let $y_1, y_2 \in Z_v$ and $c'_i = \gamma_v(y_i)$, $i = 1,2$. The set $E(v)$ is convex (by definition, since $v$ is convex),
and so it contains $W := \mathrm{co}(B(c'_1,r) \cup B(c'_2,r))$, where $\mathrm{co}(\cdot)$ denotes the convex hull. In particular,
$W \cap \mathrm{graph}(v) = \emptyset$. Since $y_i \in S(c'_i,r) \cap \mathrm{graph}(v)$, $y_i \in S(c'_i,r) \setminus W$, $i = 1,2$. Thus, $y_1$ and $y_2$ do not belong to, and
are separated by, the open region between the two hyperplanes which are orthogonal to the segment $\overline{c'_1c'_2}$ and
pass through its ends. Therefore $|c'_1 - c'_2| \leq |y_1 - y_2|$. The importance of this map will be seen below, where
combined with $u$ and the projection map $P$ it allows the set of interest in $\mathbb{R}^n$ to be related to a small disk.
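The Lipschitz property of the center map can be observed numerically in the plane. The sketch below uses the hypothetical example $v(x) = x^2/2$, whose graph has curvature at most 1, so that a circle of radius $r \leq 1$ tangent from above at any graph point stays in the epigraph; its center is the contact point plus $r$ times the upward unit normal. The function names, the sampling range, and the choice $r = 0.5$ are assumptions for illustration only.

```python
import math
import random

def center_of_support_circle(x, r):
    """Center of the circle of radius r tangent from above to the graph of
    v(x) = x^2 / 2 at (x, v(x)): contact point plus r times the upward unit normal."""
    norm = math.sqrt(1.0 + x * x)            # v'(x) = x
    return (x - r * x / norm, 0.5 * x * x + r / norm)

def worst_ratio(r, samples=2000, seed=0):
    """Largest observed |c1 - c2| / |y1 - y2| over random pairs of contact points."""
    rng = random.Random(seed)
    worst = 0.0
    for _ in range(samples):
        x1, x2 = rng.uniform(-2, 2), rng.uniform(-2, 2)
        if abs(x1 - x2) < 1e-9:
            continue                          # skip (nearly) coincident points
        c1 = center_of_support_circle(x1, r)
        c2 = center_of_support_circle(x2, r)
        dist_y = math.hypot(x1 - x2, 0.5 * x1 * x1 - 0.5 * x2 * x2)
        dist_c = math.hypot(c1[0] - c2[0], c1[1] - c2[1])
        worst = max(worst, dist_c / dist_y)
    return worst

ratio = worst_ratio(r=0.5)                    # expected to stay at or below 1
```

A ratio bounded by 1 is what the separating-hyperplane argument above predicts; a random search of this kind can only fail to find a counterexample, it does not prove the bound.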
Step Three. Three key set inclusions are established. Along with step two, this will allow the measure of $X$
on small neighbourhoods to be bounded from below by the volume of small $n$-balls. Using the notation
above, let $Z^\alpha$ and $\gamma^\alpha$ denote the set $Z_v$ and map $\gamma_v$, respectively, for $v = v_\alpha$, where $0 < \alpha < \frac{1}{2}\arcsin(\frac{d}{R})$.
Consider the set

$$U_\alpha = \mathrm{graph}(v_\alpha) \setminus (C_\alpha \cup T_\alpha). \tag{7}$$

Note that this is a subset of the graph of $u$. For $\alpha \in (0, \frac{1}{2}\arcsin(\frac{d}{R}))$, we have the following three inclusions:

$$P(U_\alpha) \subset B(0, R\sin 2\alpha) \tag{8}$$

$$Z^\alpha \cap U_\alpha \subset Z \cap U_\alpha \tag{9}$$

$$B_n(0,\delta) \subset P\gamma^\alpha(Z^\alpha \cap U_\alpha), \ \text{ where } \delta = (R-r)\tan\alpha. \tag{10}$$

The first inclusion follows directly from the definition of $U_\alpha$: $|x| \geq R\sin 2\alpha \Rightarrow (x,v_\alpha(x)) \in T_\alpha$.

By (6), $Z^\alpha \cap \mathrm{graph}(u) \subset Z$. To see this, let $z \in Z^\alpha$. Thus we have a $c' \in \mathbb{R}^{n+1}$ such that $B(c',r) \subset E(v_\alpha)$
and $z \in S(c',r)$. So there is a sphere of radius $r$ supporting the graph of $v_\alpha$ from above at $z$. If $z \in \mathrm{graph}(u)$,
then we must have $z \in Z$: $B(c',r) \subset E(v_\alpha)$ and $v_\alpha(x) \geq u(x)$ give us that $B(c',r) \cap \mathrm{graph}(u) = \emptyset$ and
$c'_{n+1} > u(P(c'))$, which together with $z \in S(c',r)$ imply that $z \in Z$, by definition. Since $U_\alpha \subset \mathrm{graph}(u)$,
$Z^\alpha \cap U_\alpha \subset Z^\alpha \cap \mathrm{graph}(u) \subset Z$. And of course $Z^\alpha \cap U_\alpha \subset U_\alpha$, so together we have $Z^\alpha \cap U_\alpha \subset Z \cap U_\alpha$,
which gives us the second inclusion.

The third inclusion is the critical aforementioned relation between the set of points with spheres of
support and a disk in $\mathbb{R}^n$. (Below we will take $\epsilon = R\sin 2\alpha$ and $\delta = (R-r)\tan\alpha$.) To obtain this inclusion
we proceed as follows. Let $x \in \mathbb{R}^n$ be such that $|x| < R - r$, and consider the set

$$\{c' \in \{x\} \times \mathbb{R} : B(c',r) \subset E(v_\alpha)\}. \tag{11}$$

This set is a non-empty, closed half-line. To see this, consider lowering the sphere $S((x,c'_{n+1}),r)$ in $\mathbb{R}^{n+1}$
onto the graph of $v_\alpha$, by continuously decreasing the last coordinate. Because the radius of this sphere is $r$
and $|x| < R - r$, this sphere comes down on a closed subset of the graph of $v_\alpha$. Once contact is made with the
graph of $v_\alpha$ we stop, and the corresponding value of $(x,c'_{n+1})$ is our closed endpoint. Let $c' \in \mathbb{R}^{n+1}$ be this
endpoint and $y \in S(c',r) \cap \mathrm{graph}(v_\alpha)$ (note that $y$ may not be unique). Then $c' = \gamma^\alpha(y)$ and $x = P\gamma^\alpha(y)$,
and so

$$B_n(0, R-r) \subset P\gamma^\alpha(Z^\alpha). \tag{12}$$

Now $Z^\alpha \setminus (C_\alpha \cup T_\alpha) \subset \mathrm{graph}(v_\alpha) \setminus (C_\alpha \cup T_\alpha) = U_\alpha$, so clearly $Z^\alpha \setminus (C_\alpha \cup T_\alpha) \subset Z^\alpha \cap U_\alpha$. Therefore,

$$P\gamma^\alpha(Z^\alpha) \setminus P\gamma^\alpha(Z^\alpha \cap (C_\alpha \cup T_\alpha)) \subset P\gamma^\alpha(Z^\alpha \cap U_\alpha). \tag{13}$$

This relation and (12) will give us our third inclusion (10), once we show that

$$P\gamma^\alpha(Z^\alpha \cap (C_\alpha \cup T_\alpha)) \cap B_n(0,\delta) = \emptyset. \tag{14}$$

Consider the family of all spheres $S(c',r)$ which support $C_\alpha \setminus Y$ from above and are contained in the upper
half space $y_{n+1} \geq 0$. Clearly the smallest value of $|P(c')|$ is attained when the sphere $S(c',r)$ is tangent to
both $C_\alpha$ and $\{y_{n+1} = 0\}$ (see Fig. 3). It is not difficult to see that in this case $\angle(c'-c, 0-c) = \alpha$, where $c$ here
is the center of the initial sphere of support. This gives us $|P(c')| = (|c| - c'_{n+1})\tan\alpha = (R-r)\tan\alpha = \delta$,
which implies

$$P\gamma^\alpha(Z^\alpha \cap C_\alpha) \cap B_n(0,\delta) = \emptyset. \tag{15}$$

Now when $S(c',r)$ supports $T_\alpha \setminus Y$ from above at some point $y$, the segment $\overline{c'y}$ is normal to $S(c,R)$
and $y_{n+1} > R(1-\cos 2\alpha) > 0$. Thus $\angle(c'-c, 0-c) > 2\alpha$ and, as above, $|P(c')| > (R-r)\tan 2\alpha > \delta$ (note
$0 < \alpha < \frac{\pi}{4}$). This gives

$$P\gamma^\alpha(Z^\alpha \cap T_\alpha) \cap B_n(0,\delta) = \emptyset. \tag{16}$$

Combining (15) and (16) we have (14), which gives the third inclusion.

Step Four. Estimate of the density of $X$. The above inclusions, together with the effect of Lipschitz maps on measure, will be enough to estimate the density of $X = P(Z)$. Recall that $Z = \{(x,u(x)) \in \mathbb{R}^{n+1} \mid x \in X\}$, where $X$ is the set of points in $B(0,d) \subset \mathbb{R}^n$ such that there exists a sphere of radius $r$ supporting the graph of $u$ from above at $(x,u(x))$.

Using a few theorems from Rockafellar [7], it can be shown that the map $\varphi : P(U^\alpha) \to U^\alpha$, where $\varphi(x) = (x,u(x))$, is Lipschitz with constant $(1+g_\alpha^2)^{1/2}$, where $g_\alpha = \sup\{|\nabla u(x)| : |x| < R\sin 2\alpha\}$. More specifically, by Theorem 10.4, $u$ is Lipschitz, and by Theorems 24.7, 25.5, and 25.6, $g_\alpha$ is a Lipschitz bound for $u|_{B(0,R\sin 2\alpha)}$. A simple Pythagorean argument then shows that $(1+g_\alpha^2)^{1/2}$ is a Lipschitz bound for $\varphi$. Notice that $\varphi$ maps $X\cap P(U^\alpha) = P(Z\cap U^\alpha)$ onto $Z\cap U^\alpha$.

A basic theorem regarding the effect of Lipschitz maps on Hausdorff measures (Theorem 2.29 in Rogers [8]), along with our first inclusion from above (7), leads to:

$$\begin{aligned}
H_n(Z\cap U^\alpha) &\le (1+g_\alpha^2)^{n/2}\, m_n(X\cap P(U^\alpha)) \\
&\le (1+g_\alpha^2)^{n/2}\, m_n(X\cap B(0,\varepsilon)), \qquad \varepsilon = R\sin 2\alpha,
\end{aligned}$$

where again $H_n$ and $m_n$ denote the Hausdorff and Lebesgue measure on $\mathbb{R}^n$, respectively. Furthermore

$$\begin{aligned}
m_n(B(0,\delta)) &\le m_n(P\gamma^\alpha(Z^\alpha\cap U^\alpha)) && \text{by (9)} \\
&\le H_n(Z^\alpha\cap U^\alpha) && P\gamma^\alpha \text{ is Lipschitz with constant} \le 1 \\
&\le H_n(Z\cap U^\alpha) && \text{by (8).}
\end{aligned}$$


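Finally, combining these inequalities one obtains

$$\begin{aligned}
\frac{m_n(X\cap B(0,\varepsilon))}{m_n(B(0,\varepsilon))}
&\ge (1+g_\alpha^2)^{-n/2}\,\frac{m_n(B(0,\delta))}{m_n(B(0,\varepsilon))} \\
&= (1+g_\alpha^2)^{-n/2}\left(\frac{(R-r)\tan\alpha}{R\sin 2\alpha}\right)^{n} \\
&= (1+g_\alpha^2)^{-n/2}\left(\frac{R-r}{2R}\right)^{n}\cos^{-2n}\alpha,
\end{aligned}$$

where the volume of an $n$-ball of radius $r$ is $\dfrac{\pi^{n/2}\,r^n}{\Gamma(\frac{n}{2}+1)}$ in the first equality, and $\Gamma$ denotes the gamma function. Thus,

$$\liminf_{\varepsilon\to 0}\frac{m_n(X\cap B(0,\varepsilon))}{m_n(B(0,\varepsilon))}
\ge \liminf_{\varepsilon\to 0}\,(1+g_\alpha^2)^{-n/2}\left(\frac{R-r}{2R}\right)^{n}\cos^{-2n}\alpha.$$

Now since $\varepsilon = R\sin 2\alpha$ and $0 < \alpha < \pi/4$, as $\varepsilon \to 0$, $\alpha \to 0$. And as the gradient of a convex function is continuous (Theorem 25.5, [7]), $g_\alpha \to 0$ as well since $\nabla u(0) = 0$. Therefore the lower density of $X$ at $0$ is not less than

$$\left(\frac{R-r}{2R}\right)^{n}.$$

□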
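The trigonometric simplification in the last display can be checked numerically. The following is a minimal sketch, not part of the original argument: the function names and the sample values of $R$, $r$, $\alpha$, $n$ are illustrative assumptions, and the gradient factor defaults to $g = 0$ so that only the geometric term is tested.

```python
import math

def ratio_direct(R, r, alpha, n, g=0.0):
    """(1 + g^2)^(-n/2) * (delta / eps)^n with
    delta = (R - r) tan(alpha) and eps = R sin(2 alpha)."""
    delta = (R - r) * math.tan(alpha)
    eps = R * math.sin(2.0 * alpha)
    return (1.0 + g * g) ** (-n / 2.0) * (delta / eps) ** n

def ratio_closed_form(R, r, alpha, n, g=0.0):
    """Simplified form: (1 + g^2)^(-n/2) * ((R - r)/(2R))^n * cos(alpha)^(-2n)."""
    return ((1.0 + g * g) ** (-n / 2.0)
            * ((R - r) / (2.0 * R)) ** n
            * math.cos(alpha) ** (-2 * n))

def limiting_bound(R, r, n):
    """Value obtained as alpha -> 0 and g -> 0."""
    return ((R - r) / (2.0 * R)) ** n
```

For instance, `ratio_direct(2.0, 1.0, 0.3, 3)` and `ratio_closed_form(2.0, 1.0, 0.3, 3)` agree up to rounding, and for very small `alpha` both approach `limiting_bound(2.0, 1.0, 3)`.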


2.4 Proof of Theorem 1.3 

Lemma 1.7 and Proposition 1.6 now combine nicely to give us Theorem 1.3. 

Proof of Theorem 1.3. First, we prove the density result. Without loss of generality, let $x_0 = 0$, $u(x_0) = 0$, $\nabla u(x_0) = 0$. Note that by the convexity of $u$ this implies $u \ge 0$. Set $k_0 = K(u,x_0) = K(u,0)$, let $k > k_0$ be fixed, and take $\kappa$ such that $k > \kappa > k_0$.

Set $R = \frac{1}{\kappa}$ and note that $R - (R^2 - |x|^2)^{1/2} \ge \frac{1}{2R}|x|^2 = \frac{\kappa}{2}|x|^2$ for all $x$ such that $|x| < R$. This follows immediately by contradiction. The left-hand side of this inequality is the last component of the point $(x,t) \in \mathbb{R}^{n+1}$, where $x \in \mathbb{R}^n$, on the $(n+1)$-dimensional sphere of radius $R$ centered at $(0,\dots,0,R) \in \mathbb{R}^{n+1}$ (i.e. the value of $d(x)$, where $d$ is the lower hemisphere function defined in the proof of the proposition; see Fig. 1).
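The inequality between the lower hemisphere and the parabola $\frac{1}{2R}|x|^2$ depends only on $t = |x|$, so it can be sanity-checked in one variable. The sketch below is illustrative only; the helper names and the tolerance (which absorbs floating-point cancellation near $t = 0$) are assumptions.

```python
import math

def hemisphere(R, t):
    """Height of the lower hemisphere of radius R centred at height R, at |x| = t < R."""
    return R - math.sqrt(R * R - t * t)

def parabola(R, t):
    """The comparison parabola t^2 / (2R), i.e. (kappa / 2) t^2 with kappa = 1/R."""
    return t * t / (2.0 * R)

def hemisphere_dominates(R, samples=1000, tol=1e-12):
    """Check hemisphere(R, t) >= parabola(R, t) on a uniform grid of t in [0, R)."""
    return all(
        hemisphere(R, R * i / samples) >= parabola(R, R * i / samples) - tol
        for i in range(samples)
    )
```

Squaring $\sqrt{R^2 - t^2} \le R - \frac{t^2}{2R}$ shows why the check succeeds: the two sides differ by $\frac{t^4}{4R^2} \ge 0$.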

Since $\kappa > K(u,0)$, by Proposition 1.6 (ii) there exists $d > 0$ such that

$$u(0+h) - u(0) - \langle\nabla u(0), h\rangle \le \tfrac{1}{2}\kappa|h|^2 \quad\text{for every } |h| < d.$$


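So, since $u(0) = 0$ and $\nabla u(0) = 0$,

$$u(h) \le \tfrac{1}{2}\kappa|h|^2 \le R - (R^2 - |h|^2)^{1/2} \quad\text{for every } |h| < \min(d, R).$$

[Figure 4]

Thus the sphere $S(c,R)$, where $c = (0,\dots,0,R) \in \mathbb{R}^{n+1}$, supports the graph of $u|_{B(0,d)}$ from above at $0 \in \mathbb{R}^{n+1}$, and Lemma 1.7 can be applied to the function $u|_{B(0,d)}$.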

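Let $r$, such that $\frac{1}{k} < r < R$, be arbitrary, and let $X = X_r$ and $Z = Z_r$ be defined as in Lemma 1.7. By Proposition 1.6 (iii), for all $x \in X$,

$$K(u,x) \le \frac{(1+g^2)^{3/2}}{r}, \quad\text{where } g = |\nabla u(x)|.$$

Set

$$g_\varepsilon = \sup\{|\nabla u(x)| : |x| < \varepsilon\}.$$

Then clearly

$$K(u,x) \le \frac{(1+g_\varepsilon^2)^{3/2}}{r}, \quad \forall x \in X\cap B(0,\varepsilon).$$

By the continuity of the gradient function, $\lim_{\varepsilon\to 0} g_\varepsilon = |\nabla u(0)| = 0$. Thus since $\frac{1}{r} < k$, there exists $\varepsilon'$, where $0 < \varepsilon' < d$, such that

$$\frac{(1+g_\varepsilon^2)^{3/2}}{r} < k \quad\text{for } 0 < \varepsilon < \varepsilon',$$

and so

$$(B(0,\varepsilon)\cap X) \subset (B(0,\varepsilon)\cap X'_k) \quad\text{for } 0 < \varepsilon < \varepsilon'.$$

That is, if $x \in X$ then there exists a supporting sphere of radius $r$ at $(x,u(x))$, and if in addition $x \in B(0,\varepsilon)$, where $\varepsilon < \varepsilon'$, then $K(u,x) < k$.

It follows by Lemma 1.7 that

$$\liminf_{\varepsilon\to 0}\frac{m_n(X'_k\cap B(0,\varepsilon))}{m_n(B(0,\varepsilon))}
\ge \liminf_{\varepsilon\to 0}\frac{m_n(X\cap B(0,\varepsilon))}{m_n(B(0,\varepsilon))}
\ge \left(\frac{R-r}{2R}\right)^{n}.$$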


Now recall that $R = \frac{1}{\kappa}$ was chosen arbitrarily so that it satisfied the inequality $\frac{1}{k} < \frac{1}{\kappa} < \frac{1}{k_0}$, where $k$ and $k_0$ are fixed. Similarly, $r$ was chosen arbitrarily so that $\frac{1}{k} < r < \frac{1}{\kappa}$. Thus we can choose $R = \frac{1}{\kappa}$ and $r$ arbitrarily close to $\frac{1}{k_0}$ and $\frac{1}{k}$, respectively, giving us the desired bound, since $\frac{R-r}{2R}$ then tends to $\frac{1/k_0 - 1/k}{2/k_0} = \frac{k-k_0}{2k}$.

Finally, the fact that $X'_k := \{x \in \operatorname{dom}(u) : K(u,x) < k\}$ is Borel is contained in Proposition 2.2 and Lemma 2.3 below. Let $u : \mathbb{R}^n \to \mathbb{R}$ be convex. Proposition 2.2 shows that the set $W$ on which $u$ is differentiable is Borel, specifically an $F_{\sigma\delta}$, and Lemma 2.3 proves that $K(x) := K(u,x)$ is of second Baire class on this set. Since $K(x) = +\infty$ where $\nabla u$ does not exist,

$$X'_k = \{x \in \operatorname{dom}(u) : K(x) < k\} = \{x \in W : K|_W(x) < k\}.$$

It follows immediately that $X'_k$ is Borel, as $K|_W : W \to \mathbb{R}$ is a Borel measurable function. Recall that Baire class 1 functions are pointwise limits of continuous functions and thus Borel measurable, and Baire class 2 functions are pointwise limits of Baire class 1 functions and thus also Borel measurable. □



Proposition 2.2. Let $u : \mathbb{R}^n \to \mathbb{R}$ be convex. Then the set on which $u$ is differentiable is a dense Borel set, specifically an $F_{\sigma\delta}$.

Proof. Since $u$ is convex, $u$ is differentiable at $x$ if and only if all the partial derivatives of $u$ exist at $x$, with respect to any basis [6, IV.4.2]. Let $\{e_i\}_{i=1}^n$ be the standard basis in $\mathbb{R}^n$ and, writing $f$ for $u$, define

$$f'(x, e_i) := \lim_{t\downarrow 0}\frac{f(x+te_i) - f(x)}{t}.$$

Then $\frac{\partial f}{\partial x_i}(x)$ exists if and only if $f'(x,e_i) = -f'(x,-e_i)$ [6, IV.4.2]. Note that the above limit always exists for a convex function and $f'(x,e_i) \ge -f'(x,-e_i)$ for all $x$. Take $E$ to be the set where $u$ is not differentiable and $E_i$ to be the set of points where $\frac{\partial f}{\partial x_i}(x)$ does not exist. Then $E = \bigcup_{i=1}^n E_i$ and

$$E_i = \{x : f'(x,e_i) + f'(x,-e_i) > 0\}.$$

If $x \in E_i$, then there exists $N$ such that for all $n > N$,

$$\frac{f(x+\frac{e_i}{k}) - f(x)}{1/k} + \frac{f(x-\frac{e_i}{k}) - f(x)}{1/k} > \frac{1}{n}$$

for all $k > n$. Let

$$E_{n,k} = \left\{x : \frac{f(x+\frac{e_i}{k}) - f(x)}{1/k} + \frac{f(x-\frac{e_i}{k}) - f(x)}{1/k} > \frac{1}{n}\right\}$$

and note that $E_{n,k}$ is open since $f$ is continuous (a real-valued convex function). Thus,

$$E_i = \bigcup_{n=1}^{\infty}\bigcap_{k=n}^{\infty} E_{n,k},$$

which is clearly a $G_{\delta\sigma}$, and so $E$ is also a $G_{\delta\sigma}$, being a union of finitely many. Therefore, the set $\mathbb{R}^n\setminus E$ on which $u$ is differentiable is an $F_{\sigma\delta}$. That $\mathbb{R}^n\setminus E$ is dense is well-known. □
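The criterion used in this proof, that the sum of the two one-sided difference quotients stays bounded away from zero exactly at points of non-differentiability, can be illustrated in one variable. This is a hedged sketch: `quotient_sum` is a hypothetical helper name, and the functions $|x|$ and $x^2$ are chosen only as examples.

```python
def quotient_sum(f, x, k):
    """Sum of forward and backward difference quotients of f at x with step 1/k.

    For convex f this decreases, as k grows, to f'(x, 1) + f'(x, -1),
    which is positive exactly when f is not differentiable at x.
    """
    t = 1.0 / k
    return (f(x + t) - f(x)) / t + (f(x - t) - f(x)) / t

def square(x):
    """A differentiable convex example."""
    return x * x
```

With `abs` the sum equals 2 at the kink `x = 0` for every `k`, while at `x = 1` it vanishes once the step is smaller than 1; for `square` it equals `2 / k` and tends to zero everywhere.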



Lemma 2.3. Let $u : \mathbb{R}^n \to \mathbb{R}$ be convex and $W \subset \mathbb{R}^n$ the set on which $u$ is differentiable. Then the function $K(x) := K(u,x)$ is of second Baire class on $W$.

Proof. We follow notes of Slodkowski, not contained in [9], for this proof. Let

$$f(x,\varepsilon) = \frac{2}{\varepsilon^2}\max\{u(x+\varepsilon h) - u(x) - \varepsilon\langle\nabla u(x), h\rangle : |h| = 1\}.$$

Then $K(x) = \limsup_{\varepsilon\to 0} f(x,\varepsilon)$. Since $u$ is convex, $\nabla u(x)$ is continuous on $W$, and so $f(x,\varepsilon)$ is a continuous function on $W\times(0,\infty)$.

Next, let

$$g(x,n) = \sup\{f(x,\varepsilon) : 0 < \varepsilon < \tfrac{1}{n}\}.$$

Since $g(\cdot,n)$ is the supremum of a family of continuous functions it is lower semicontinuous, and thus the limit of an increasing sequence of continuous functions on $W$. Therefore, $g(\cdot,n)$ is of first Baire class.

Now, note that

$$\limsup_{\varepsilon\to 0} f(x,\varepsilon) = \lim_{n\to\infty} g(x,n),$$

and thus $K(x)$ is of second Baire class, as it is the limit of Baire class one functions. □
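To make the quantity $f(x,\varepsilon)$ concrete, the following sketch evaluates it for a quadratic $u(x) = \frac{1}{2}x^{T}Ax$ on $\mathbb{R}^2$, where it should reproduce the largest eigenvalue of $A$ for every $\varepsilon$. The matrix, the number of sampled directions, and all function names are illustrative assumptions, not part of the original notes.

```python
import math

def u(x, A):
    """u(x) = 1/2 x^T A x for a symmetric 2x2 matrix A = ((a, b), (b, c))."""
    (a, b), (_, c) = A
    return 0.5 * (a * x[0] ** 2 + 2.0 * b * x[0] * x[1] + c * x[1] ** 2)

def grad_u(x, A):
    """Gradient A x of the quadratic above."""
    (a, b), (_, c) = A
    return (a * x[0] + b * x[1], b * x[0] + c * x[1])

def f_eps(x, eps, A, directions=3600):
    """(2 / eps^2) * max over sampled unit h of
    u(x + eps h) - u(x) - eps <grad u(x), h>."""
    g = grad_u(x, A)
    best = -math.inf
    for j in range(directions):
        theta = 2.0 * math.pi * j / directions
        h = (math.cos(theta), math.sin(theta))
        shifted = (x[0] + eps * h[0], x[1] + eps * h[1])
        value = u(shifted, A) - u(x, A) - eps * (g[0] * h[0] + g[1] * h[1])
        best = max(best, value)
    return 2.0 * best / eps ** 2

def lambda_max(A):
    """Largest eigenvalue of a symmetric 2x2 matrix, in closed form."""
    (a, b), (_, c) = A
    return 0.5 * (a + c) + math.sqrt(0.25 * (a - c) ** 2 + b * b)
```

For example, with `A = ((2.0, 1.0), (1.0, 3.0))` the value `f_eps((0.3, -0.2), 0.1, A)` agrees with `lambda_max(A)` up to the angular sampling error, consistent with $K(u,x)$ being the largest eigenvalue of the Hessian for $C^2$ functions.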
3. Dual Perspective


3.1 Background 

Since $u$ is convex near $x_0$, it is natural to study the quantity $K(u,x_0)$ from the dual perspective as well. Let $\mathrm{Cvx}(\mathbb{R}^n)$ denote the space of convex, lower semi-continuous functions on $\mathbb{R}^n$. Given a function $u \in \mathrm{Cvx}(\mathbb{R}^n)$, one can apply the Legendre-Fenchel transform $\mathcal{L} : \mathrm{Cvx}(\mathbb{R}^n) \to \mathrm{Cvx}(\mathbb{R}^n)$ to obtain its conjugate or dual function $u^*$, where

$$u^*(s) = \mathcal{L}u(s) = \sup_{x}\,(\langle s,x\rangle - u(x)).$$

$\mathcal{L}$ is an order-reversing, involutive transform on $\mathrm{Cvx}(\mathbb{R}^n)$, and for sufficiently nice convex functions (differentiable, strictly convex, and 1-coercive), $u^*$ is given by

$$u^*(s) = \langle s, (\nabla u)^{-1}(s)\rangle - u((\nabla u)^{-1}(s)).$$

The conjugate function $u^*$ can be viewed as a reparametrization of the original function $u$ in terms of its tangents, using the duality between points and hyperplanes. More specifically, given a vector in $\mathbb{R}^n$, there is an associated family of hyperplanes with that gradient; $u^*$ distinguishes the one that supports the epigraph of $u$ by specifying a point on that plane.
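The definition of the conjugate as a supremum can be explored numerically on a grid. The sketch below is only an illustration under stated assumptions: the grid, the helper `conjugate`, and the two sample quadratics are hypothetical choices, and the discrete maximum merely approximates the true supremum.

```python
def conjugate(u, s, grid):
    """Discrete Legendre-Fenchel transform: max over grid points x of s*x - u(x)."""
    return max(s * x - u(x) for x in grid)

# Uniform grid on [-5, 5] with step 1e-3 (an arbitrary illustrative choice).
GRID = [i / 1000.0 for i in range(-5000, 5001)]

def u_small(x):
    """u(x) = (3/2) x^2, whose conjugate is s^2 / 6."""
    return 1.5 * x * x

def u_large(x):
    """A larger quadratic, (5/2) x^2 >= u_small(x), with conjugate s^2 / 10."""
    return 2.5 * x * x

def u_small_star(s):
    """Closed-form conjugate of u_small, used to illustrate the involution."""
    return s * s / 6.0
```

Here `conjugate(u_small, 2.0, GRID)` is close to $4/6$, the inequality `u_small <= u_large` is reversed for the conjugates, and applying `conjugate` to `u_small_star` at `0.5` recovers `u_small(0.5)`, in line with the order-reversing and involutive properties stated above.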

For convex functions defined only in a neighbourhood it is standard to extend the function to all of $\mathbb{R}^n$ by setting it equal to $+\infty$ outside that neighbourhood. In our case, we are given $u$ convex near $x_0$, so we extend it in this manner, if necessary. Clearly this does not affect $K(u,x_0)$, which is a purely local property. Recall the following basic definitions:

Definition 3.1. The differentiable function $f : \mathbb{R}^n \to \mathbb{R}$ is convex if for all $x, x' \in \mathbb{R}^n$

$$f(x') \ge f(x) + \langle\nabla f(x), x'-x\rangle,$$

and strictly convex if the inequality is strict for $x \ne x'$.

Definition 3.2. The differentiable function $f : \mathbb{R}^n \to \mathbb{R}$ is strongly convex with modulus $c$ if and only if for all $(x,x') \in \mathbb{R}^n\times\mathbb{R}^n$,

$$f(x') \ge f(x) + \langle\nabla f(x), x'-x\rangle + \tfrac{1}{2}c|x'-x|^2.$$

When $f$ is not differentiable a lot of analysis can still be done using the calculus of subdifferentials.

Definition 3.3. Let $f : \mathbb{R}^n \to \mathbb{R}$ be convex. The subdifferential of $f$, denoted $\partial f$, is a set function, where

$$\partial f(x) = \{s \in \mathbb{R}^n : f(y) \ge f(x) + \langle s, y-x\rangle \ \ \forall y \in \mathbb{R}^n\}.$$

Under the Legendre transform, differentiability of $u$ corresponds to convexity or monotonicity of $u^*$. Recall from Proposition 1.8 that two properties that transform especially well are (i) $u \in C^1$ if and only if $u^*$ is strictly convex, and (ii) $u \in C^{1,1}$, where $\nabla u$ has Lipschitz constant $c$, if and only if $u^*$ is strongly convex with modulus $\frac{1}{c}$.

3.2 Quadratic convexity 

In this section we look at how a bound on $K(u,x_0)$, or equivalently a sphere of support to the graph of $u$ at $(x_0,u(x_0))$, transforms to a property of $u^*$. More specifically, since $K$ or a sphere of support is a bound on a generalized second-order derivative of $u$, how does this translate to information about the convexity of $u^*$? We should expect a more localized property than in Proposition 1.8, as we only have information at $x_0$. Further, we are not assuming any regularity beyond differentiability at $x_0$.

Now, strong convexity may also be defined in terms of quadratic functions: $u$ is strongly convex with modulus $m$ if $u - \frac{1}{2}m|x|^2$ is convex. Similarly, quasi-convexity is defined via quadratics: $u$ is $\lambda$-quasi-convex if $u + \frac{1}{2}\lambda|x|^2$ is convex.

Let $u : \mathbb{R}^n \to \mathbb{R}$ be convex with $K(u,x_0) = k_0 < \infty$. By the definition of $K(u,x)$, for any $k > k_0$ there exists $\varepsilon > 0$ such that

$$u(x_0+h) - u(x_0) - \langle\nabla u(x_0), h\rangle \le \tfrac{1}{2}k|h|^2 \quad\text{for all } |h| < \varepsilon.$$

This motivates the following definition. 

Definition 3.4. Let $f : \mathbb{R}^n \to \mathbb{R}$ be convex. Then $f$ is quadratically (resp. sub-quadratically) convex at $x_0$ with modulus $m > 0$ if there exists $\varepsilon > 0$ and a quadratic function $Q : \mathbb{R}^n \to \mathbb{R}$ with $\nabla^2 Q = mI$ such that

$$f(x_0) = Q(x_0) \quad\text{and}\quad f(x) \ge Q(x), \ \ \forall x \in B(x_0,\varepsilon),$$

resp.

$$f(x_0) = Q(x_0) \quad\text{and}\quad f(x) \le Q(x), \ \ \forall x \in B(x_0,\varepsilon).$$

Example 3.5. $f(x) = |x|^{4/3}$ is quadratically convex at $0$, but not sub-quadratically convex at $0$. Note also that $K(f,0) = +\infty$ and it does not have a sphere of support at $0$.

Example 3.6. More generally, consider any function of the form $f(x) = \lambda|x|^k$, at $x = 0$. If $0 < k < 1$, $f$ is not convex. If $k = 1$, $f$ is quadratically convex at $0$, but not sub-quadratically convex. If $1 < k < 2$ then $f$ is strictly convex and quadratically convex but not sub-quadratically convex. If $k = 2$, $f$ is both quadratically convex and sub-quadratically convex. If $k > 2$, $f$ is sub-quadratically convex but not quadratically convex.

If $f$ is of the form $f = \frac{1}{p}|x|^p$ then $f^* = \frac{1}{q}|x|^q$, where $\frac{1}{p} + \frac{1}{q} = 1$. So, in general, given that the Legendre-Fenchel transform is order-reversing and quadratics are transformed into quadratics, it follows that if $f$ is quadratically convex, $f^*$ is sub-quadratically convex. For a convex $C^2$ function $f$, if $\nabla^2 f(x_0)$ is positive definite then $f$ is both quadratically and sub-quadratically convex at $x_0$.
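The conjugate pair $\frac{1}{p}|x|^p \leftrightarrow \frac{1}{q}|y|^q$ can be checked on a grid in one variable. This is a rough sketch under assumptions: the grid bounds, the helper names, and the choice $p = 4$, $q = 4/3$ are illustrative, and the grid maximum only approximates the supremum.

```python
def conjugate_on_grid(f, y, grid):
    """Approximate f*(y) = sup_x (y*x - f(x)) by a maximum over grid points."""
    return max(y * x - f(x) for x in grid)

def power_function(p):
    """Return x -> |x|^p / p for a given exponent p > 1."""
    return lambda x: abs(x) ** p / p

def dual_exponent(p):
    """Conjugate exponent q with 1/p + 1/q = 1."""
    return p / (p - 1.0)

# Uniform grid on [-3, 3] with step 1e-3 (illustrative).
X_GRID = [i / 1000.0 for i in range(-3000, 3001)]
```

For `p = 4` the dual exponent is `4/3`, and `conjugate_on_grid(power_function(4.0), 2.0, X_GRID)` is close to `power_function(4.0 / 3.0)(2.0)`. The quartic is sub-quadratically convex at $0$ while its conjugate is only quadratically convex there, matching Example 3.6.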

Proof of Theorem 1.9. Suppose $K(u,x_0) = k_0 < \infty$. As stated above, by definition of $K(u,x_0)$, for any $k > k_0$ there exists $\varepsilon > 0$ such that $u$ satisfies

$$u(x) - u(x_0) - \langle\nabla u(x_0), x-x_0\rangle \le \tfrac{1}{2}k|x-x_0|^2$$

for all $x \in B(x_0,\varepsilon)$. Thus, on this neighbourhood of $x_0$,

$$u(x) \le u(x_0) + \langle\nabla u(x_0), x-x_0\rangle + \tfrac{1}{2}k|x-x_0|^2.$$

By assumption $u$ is convex, and $k > k_0 \ge 0$, so the right-hand side is also convex. Taking the Legendre transform gives

$$u^*(y) \ge \langle\nabla u(x_0), x_0\rangle - u(x_0) + \langle x_0, y-\nabla u(x_0)\rangle + \frac{1}{2k}|y-\nabla u(x_0)|^2.$$

Now $u^*$ may not be differentiable at $\nabla u(x_0)$; however, $\nabla u(x_0) \in \partial u(x_0)$ if and only if $x_0 \in \partial u^*(\nabla u(x_0))$, which is equivalent to $u^*(\nabla u(x_0)) = \langle\nabla u(x_0), x_0\rangle - u(x_0)$. So the above inequality simplifies to

$$u^*(y) \ge u^*(\nabla u(x_0)) + \langle x_0, y-\nabla u(x_0)\rangle + \frac{1}{2k}|y-\nabla u(x_0)|^2.$$

Note that there is equality at $y_0 = \nabla u(x_0)$ and the Hessian of the right-hand side is $\frac{1}{k}I$, so $u^*$ is quadratically convex with modulus $\frac{1}{k}$.

On the other hand, if $u^*$ is quadratically convex at $y_0 = \nabla u(x_0)$ with modulus $\frac{1}{k}$ then $u$ will be sub-quadratically convex with modulus $k$ at $x_0$, and it follows that $K(u,x_0) \le k$. □

In the above proof we do not need to worry about $\partial u(B(x_0,\varepsilon))$ being degenerate (for example if $u$ is locally a hyperplane at $x_0$), because in that case $u^*(y)$ will be $+\infty$ away from $\nabla u(x_0)$, so clearly the inequality will hold on some neighbourhood.

Our goal now is to obtain the nice bound on $K(u,x)$ in Proposition 1.6 using the dual function, given a sphere of support to the graph of $u$ at $(x,u(x))$. The following elementary lemma, which we state without proof, will help us to reduce arguments on $\mathbb{R}^n$ to ones on $\mathbb{R}$.

Lemma 3.7. Let $S_r$ be an $n$-sphere with radius $r$ in $\mathbb{R}^{n+1}$, centered at $(0,\dots,0,r)$, and let $d : \mathbb{R}^n \to \mathbb{R}$ be the function defined by the lower hemisphere, i.e., for $z \in B_n(0,r)$, $d(z) = r - \sqrt{r^2 - |z|^2}$. Then for any $x \in B_n(0,r)$ and $v \in \mathbb{R}^n$, $|v| = 1$, the graph of $\varphi : I \subset \mathbb{R} \to \mathbb{R}^{n+1}$ defined by $\varphi(t) = d(x+tv)$ is a lower semi-circle in $\mathbb{R}^{n+1}$ of radius $\le r$, where $I = (-\varepsilon, \varepsilon')$ is of maximal length.

Proposition 3.8. Let $f : \mathbb{R}^n \to \mathbb{R}$ be $C^2$ and convex and suppose there exists a sphere of support to the graph of $f$ at $(x_0, f(x_0))$ of radius $r$. Then

$$K(f,x_0) \le \frac{(1+|\nabla f(x_0)|^2)^{3/2}}{r}.$$


Proof. Because $f$ is $C^2$, $K(f,x_0)$ is the largest eigenvalue $\lambda_{\max}$ of $\nabla^2 f(x_0)$. If $\lambda_{\max} = 0$ or $\nabla f(x_0) = 0$ then the bound on $K(f,x_0)$ is trivial, so let $\lambda_{\max} > 0$ and $\nabla f(x_0) \ne 0$. Since $f$ is convex, $\nabla^2 f(x_0)$ is symmetric positive semi-definite, and there exists an orthonormal basis of eigenvectors. Let $v$ be the eigenvector corresponding to $\lambda_{\max}$. By duality, $v$ is also an eigenvector corresponding to $\lambda^*_{\min} = \frac{1}{\lambda_{\max}}$, the smallest eigenvalue of $\nabla^2 f^*(\nabla f(x_0))$. This follows from the fact that the Hessians of dual functions satisfy

$$\nabla^2 f^*(y_0) = \left(\nabla^2 f(x_0)\right)^{-1}, \quad\text{where } y_0 = \nabla f(x_0).$$

(Here we assume without loss of generality that $\nabla^2 f(x_0)$ is invertible because we are only concerned with $\lambda_{\max} > 0$.)

Let $S((c,t),r)$ be the sphere of support of radius $r$ to the graph of $f$ at $x_0$, and $d$ the associated lower hemisphere function, i.e.

$$d(x) = \begin{cases} t - \sqrt{r^2 - |x-c|^2}, & x \in B(c,r), \\ \infty, & \text{else.}\end{cases}$$

Clearly $d$ is convex and $d \ge f$, by definition of a supporting sphere. Also, recall that $f$ and $d$ agree up to first order at $x_0$.

Again by basic properties of the Legendre transform, the following relations hold:

$$f^*(y_0) = d^*(y_0), \qquad f^* \ge d^*, \qquad \nabla f^*(y_0) = \nabla d^*(y_0) = x_0.$$

It follows that

$$\lambda^*_{\min} \ge \gamma^*_{\min},$$

where $\gamma^*_{\min}$ is the smallest eigenvalue of $\nabla^2 d^*(\nabla f(x_0))$. Note that this is equivalent to

$$\lambda_{\max} \le \frac{1}{\gamma^*_{\min}}.$$

Given this bound, we now show that $\gamma^*_{\min}$ can always be computed using a function on $\mathbb{R}$.

Let $v'$ be the unit-length eigenvector corresponding to $\gamma^*_{\min}$ and $\gamma_{\max}$. By Proposition 2.1, $v'$ is in the direction of $\nabla d(x_0)$. By Lemma 3.7, $\tilde d$, the restriction of $d$ to this 1-dimensional subspace, defines a lower semi-circle function, and this function has the properties $\tilde d'(x_0) = \langle\nabla d(x_0), v'\rangle = |\nabla d(x_0)|$ and $\tilde d''(x_0) = \gamma_{\max}$. Therefore, the dual function $\tilde d^*$ has second derivative at $|\nabla f(x_0)|$ equal to $\gamma^*_{\min}$, and so we may assume without loss of generality that $f$ and $d$ are functions on $\mathbb{R}$.

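Now we compute $d^*$ directly by using the Legendre transforms of common functions. Rewriting $d$ as

$$d(x) = t - \sqrt{r^2 - (x-c)^2} = t - r\sqrt{1 - \left(\frac{x-c}{r}\right)^2}$$

and then applying the following well-known conjugate pairs:

$$\begin{aligned}
h(x) &= -\sqrt{1-x^2}, & h^*(y) &= \sqrt{1+y^2}, \\
g(x) &= \alpha + \beta x + \gamma\,u(\lambda x + \delta), & g^*(y) &= -\alpha - \delta\,\frac{y-\beta}{\lambda} + \gamma\,u^*\!\left(\frac{y-\beta}{\gamma\lambda}\right),
\end{aligned}$$

gives

$$d^*(y) = -t + cy + r\sqrt{1+y^2}.$$

Thus,

$$\frac{d}{dy}d^*(y) = c + \frac{ry}{\sqrt{1+y^2}}, \qquad \frac{d^2}{dy^2}d^*(y) = \frac{r}{(1+y^2)^{3/2}},$$

and so

$$\lambda_{\max} \le \frac{1}{\gamma^*_{\min}} = \frac{1}{\frac{d^2}{dy^2}d^*(|\nabla f(x_0)|)} = \frac{(1+|\nabla f(x_0)|^2)^{3/2}}{r}.$$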

□ 
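The closed form for $d^*$ and its second derivative can be checked numerically in one variable. The following sketch is illustrative only: the centre `(C, T)`, radius `R`, base point, and finite-difference step are assumed values, and `second_difference` is a standard central difference rather than anything from the paper.

```python
import math

C, T, R = 0.5, 2.0, 2.0  # illustrative centre (C, T) and radius R of the circle

def d(x):
    """Lower semicircle d(x) = T - sqrt(R^2 - (x - C)^2), for |x - C| < R."""
    return T - math.sqrt(R * R - (x - C) ** 2)

def d_prime(x):
    """Derivative of the lower semicircle."""
    return (x - C) / math.sqrt(R * R - (x - C) ** 2)

def d_star(y):
    """Conjugate from the text: d*(y) = -T + C y + R sqrt(1 + y^2)."""
    return -T + C * y + R * math.sqrt(1.0 + y * y)

def second_difference(fn, z, h=1e-4):
    """Central finite-difference approximation of fn''(z)."""
    return (fn(z + h) - 2.0 * fn(z) + fn(z - h)) / (h * h)

def curvature_bound(y):
    """Right-hand side of the bound: (1 + y^2)^(3/2) / R."""
    return (1.0 + y * y) ** 1.5 / R
```

At `x0 = 1.1` with `y0 = d_prime(x0)`, the Fenchel equality `d_star(y0) == y0 * x0 - d(x0)` holds up to rounding, the second derivatives of `d` at `x0` and `d_star` at `y0` are approximately reciprocal, and `d''(x0)` matches `curvature_bound(y0)`; that is, the bound of Proposition 3.8 is attained by the supporting sphere itself.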


The more general case, where $f$ is not assumed to be $C^2$, will use Proposition 3.8 and quadratic convexity of the dual.

Proposition 3.9. Let $f : \mathbb{R}^n \to \mathbb{R}$ be convex with a sphere of support at $x_0$ of radius $r$. Then

$$K(f,x_0) \le \frac{(1+|\nabla f(x_0)|^2)^{3/2}}{r}.$$

Proof. Let $d$ be the lower hemisphere function. Then $d(x_0) = f(x_0)$, and

$$d \ge f \implies f^* \ge d^*.$$

If $y_0 = \nabla f(x_0)$ (which exists since there is a sphere of support) then

$$d^*(y_0) = f^*(y_0) \quad\text{and}\quad \nabla d^*(y_0) \in \partial f^*(y_0).$$

From Proposition 3.8 the smallest eigenvalue of $\nabla^2 d^*(y_0)$ is equal to $\frac{r}{(1+|y_0|^2)^{3/2}}$, so for any $m < \frac{r}{(1+|y_0|^2)^{3/2}}$ there exists a neighbourhood $U$ of $y_0$ such that, for $y \in U$,

$$f^*(y) \ge d^*(y) \ge d^*(y_0) + \langle\nabla d^*(y_0), y-y_0\rangle + \tfrac{1}{2}m|y-y_0|^2.$$

Thus, $f^*$ is quadratically convex with modulus $m$.

It follows that $f = (f^*)^*$ is sub-quadratically convex at $x_0$ with modulus $\frac{1}{m}$. Let $Q_m$ be a satisfying quadratic. This implies that

$$K(f,x_0) \le K(Q_m,x_0) = \frac{1}{m},$$

and since this holds for any $m < \frac{r}{(1+|y_0|^2)^{3/2}}$,

$$K(f,x_0) \le \frac{(1+|y_0|^2)^{3/2}}{r} = \frac{(1+|\nabla f(x_0)|^2)^{3/2}}{r}.$$

□ 


Appendix 


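A.1 Lipschitz gradient

Here we show that the generalized derivative $K(f,x)$ retains the following standard property regarding the derivative of a Lipschitz continuous function.

Proposition A.1. Suppose $f : \mathbb{R}^n \to \mathbb{R}$ is convex and $C^{1,1}$ (i.e. $f$ is differentiable and has Lipschitz gradient), with Lipschitz constant $L$. Then $K(f,x) \le L$ for all $x$.

Proof. Let $x_0 \in \mathbb{R}^n$. Recall that

$$K(f,x_0) := \limsup_{\varepsilon\to 0}\, 2\varepsilon^{-2}\max\{f(x_0+\varepsilon h) - f(x_0) - \varepsilon\langle\nabla f(x_0), h\rangle : |h| = 1\},$$

which can be written as

$$K(f,x_0) = \limsup_{\varepsilon\to 0}\,\max\left\{2\,\frac{f(x_0+\varepsilon h) - f(x_0) - \varepsilon\langle\nabla f(x_0), h\rangle}{\varepsilon^2} : |h| = 1\right\}.$$

Differentiability lets us use the Cauchy mean value theorem. Let $\phi_1(\varepsilon) = f(x_0+\varepsilon h) - \varepsilon\langle\nabla f(x_0), h\rangle$ and $\phi_2(\varepsilon) = \varepsilon^2$. Note that

$$2\,\frac{f(x_0+\varepsilon h) - f(x_0) - \varepsilon\langle\nabla f(x_0), h\rangle}{\varepsilon^2} = 2\,\frac{\phi_1(\varepsilon) - \phi_1(0)}{\phi_2(\varepsilon) - \phi_2(0)}.$$

Thus, there exists $\gamma \in (0,\varepsilon)$ such that

$$\begin{aligned}
2\,\frac{\phi_1(\varepsilon) - \phi_1(0)}{\phi_2(\varepsilon) - \phi_2(0)} = 2\,\frac{\phi_1'(\gamma)}{\phi_2'(\gamma)}
&= \frac{\langle\nabla f(x_0+\gamma h), h\rangle - \langle\nabla f(x_0), h\rangle}{\gamma} \\
&= \frac{\langle\nabla f(x_0+\gamma h) - \nabla f(x_0), h\rangle}{\gamma} \\
&\le \frac{|\nabla f(x_0+\gamma h) - \nabla f(x_0)|}{\gamma} \le L.
\end{aligned}$$

Therefore $K(f,x_0) \le L$, and thus $\frac{1}{L}$ bounds the modulus of convexity of $f^*$, for any $x_0$.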


□ 
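As an illustration of Proposition A.1 in one variable, the sketch below evaluates the difference quotient from the proof for $f(x) = \log\cosh x$, whose derivative $\tanh x$ is Lipschitz with constant $L = 1$. The choice of $f$, the sample points, and the helper names are assumptions made for this example only.

```python
import math

def f(x):
    """f(x) = log(cosh(x)): convex, with f''(x) = 1 / cosh(x)^2 <= 1."""
    return math.log(math.cosh(x))

def f_prime(x):
    """f'(x) = tanh(x), Lipschitz with constant 1."""
    return math.tanh(x)

LIPSCHITZ_CONSTANT = 1.0

def quotient(x0, eps):
    """max over h in {-1, +1} of 2 (f(x0 + eps h) - f(x0) - eps f'(x0) h) / eps^2."""
    return max(
        2.0 * (f(x0 + eps * h) - f(x0) - eps * f_prime(x0) * h) / eps ** 2
        for h in (-1.0, 1.0)
    )
```

For every sampled `x0` and `eps` the quotient stays below `LIPSCHITZ_CONSTANT`, and at `x0 = 0` it approaches 1 as `eps` decreases, so the bound $K(f,x) \le L$ is attained there.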


A.2 Example of a non-$C^{1,1}$ function with a sphere of support

Example A.2. It may seem that, since a bound on $K(u,x)$ implies a sphere of support to the graph of $u$ at $(x,u(x))$, this in turn implies some kind of Lipschitz continuity of the gradient in a small neighbourhood of $x$. Here we construct an example of a strictly convex function $f$ that is $C^1$ and twice differentiable with $K(f,0) < \infty$, but with gradient not Lipschitz in any neighbourhood of $0$, to show this is not the case. Let $f : [-1,1] \to \mathbb{R}$ be given by $f(0) = 0$, and for $x > 0$

$$f'(x) = \int_0^x \gamma(t)\,dt, \quad\text{where } \gamma(t) := n+4 \text{ on } I_n \text{ and } 0 \text{ otherwise},$$

with

$$I_n = \frac{1}{(n+4)^2}\left[1 - \frac{1}{(n+4)^2},\ 1\right].$$

Define $f'(-x) := -f'(x)$.

Then $f'$ is clearly increasing and so $f$ is convex. And for $x_n = \frac{1}{(n+4)^2}$,

$$f'(x_n) = \int_0^{x_n}\gamma(t)\,dt = \sum_{k=n}^{\infty}\frac{1}{(k+4)^3} < \int_{n+3}^{\infty}\frac{dt}{t^3} = \frac{1}{2(n+3)^2} < \frac{1}{(n+4)^2} = x_n.$$

So we have $f'(x) \le x$ for all $x \in [0,1]$ and $f'(x) \ge x$ for all $x \in [-1,0]$. Since $d'(x) \ge x$ for all $x \in [0,1]$ and $d'(x) \le x$ for all $x \in [-1,0]$, it follows that the graph of $d$, and thus the unit circle centered at $(0,1)$, is always at or above the graph of $f$, with $f(0) = d(0)$. Therefore, $f$ has a sphere of support at $x_0 = 0$.

However, there exist sequences $\{x_i\}, \{x_j\}$ such that

$$\frac{f'(x_i) - f'(x_j)}{x_i - x_j}$$

blows up: taking $x_i$ and $x_j$ as the endpoints of $I_n$,

$$\frac{f'(x_i) - f'(x_j)}{x_i - x_j} = (n+4)^4\int_{I_n}(n+4)\,dt = n+4.$$

We can make $f$ strictly convex by adding an $x^m$ term, which does not affect any of the above analysis. The above example can be adjusted to show that $f'$ is not $\alpha$-Hölder continuous for any $\alpha$.
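The arithmetic behind this example can be verified with exact rational numbers. The sketch below assumes the reading $I_n = \left[\frac{1}{(n+4)^2} - \frac{1}{(n+4)^4},\ \frac{1}{(n+4)^2}\right]$ of the intervals; the helper names and the number of summed terms are illustrative choices.

```python
from fractions import Fraction

def interval(n):
    """Endpoints of I_n = [1/(n+4)^2 - 1/(n+4)^4, 1/(n+4)^2]."""
    right = Fraction(1, (n + 4) ** 2)
    return right - Fraction(1, (n + 4) ** 4), right

def slope_increment(n):
    """Integral of gamma over I_n: height (n+4) times length 1/(n+4)^4."""
    left, right = interval(n)
    return (n + 4) * (right - left)

def difference_quotient(n):
    """(f'(right) - f'(left)) / (right - left) across I_n."""
    left, right = interval(n)
    return slope_increment(n) / (right - left)

def f_prime_upper(n, terms=200):
    """Upper bound for f'(x_n) = sum_{k >= n} 1/(k+4)^3:
    a partial sum plus the integral bound 1/(2 (last+4)^2) for the tail."""
    last = n + terms
    partial = sum(Fraction(1, (k + 4) ** 3) for k in range(n, last + 1))
    return partial + Fraction(1, 2 * (last + 4) ** 2)
```

For instance, `slope_increment(0)` equals `1/64`, `difference_quotient(3)` equals `7`, and `f_prime_upper(0)` lies below `1/18`, which in turn lies below `x_0 = 1/16`; the unbounded difference quotients `n + 4` are what rule out a Lipschitz gradient near $0$.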
A.3 Osculating and locally supporting spheres

Here we extend the concept of an osculating circle to a plane curve to that of an "osculating sphere" to the graph of a function in higher dimensions. The bound on the "largest eigenvalue" $K(u,x)$ can be seen as a generalization of the relationship between the second derivative of a $C^2$ plane curve $u$ and the radius of its osculating circle:

Let $u : \mathbb{R} \to \mathbb{R}$ be $C^2$. Provided $u'' \ne 0$, the radius of curvature at $x$ is defined as

$$r_{u,x} := \frac{1}{\kappa} = \frac{(1+u'^2)^{3/2}}{u''},$$

where $\kappa$ is the curvature of $u$ at $x$, and the right-hand side comes from the standard formula for computing the curvature of a planar curve [2, §8]. Thus,

$$u'' = \frac{(1+u'^2)^{3/2}}{r}.$$

Definition A.3. The osculating circle, or circle of curvature, to a planar curve $C$ at $p$ is the circle that touches $C$ (on the concave side) at $p$ and whose radius is the radius of curvature of $C$ at $p$.

We extend this to the graphs of $C^2$ convex functions in higher dimensions by

Definition A.4. For a convex function $u : \mathbb{R}^n \to \mathbb{R}$ let the osculating sphere to the graph of $u$ at $x$ be the $n$-sphere tangent to the graph of $u$ at $x$ with radius equal to $\frac{(1+|\nabla u(x)|^2)^{3/2}}{K(u,x)}$.

It is easy to show that any tangent sphere at $(x,u(x))$ with radius less than that of the osculating sphere at that point is a (local) sphere of support, and that any tangent sphere at $(x,u(x))$ with radius greater than that of the osculating sphere cannot be a (local) sphere of support.

A.4 Spheres of support to a function and its dual

Given a convex function $u$ with a sphere of support at $(x_0,u(x_0))$, the conjugate function $u^*$ will not necessarily have a sphere of support at the corresponding point $(\nabla u(x_0), u^*(\nabla u(x_0)))$. For example, take $u = \frac{1}{4}|x|^4$ and $u^* = \frac{3}{4}|x|^{4/3}$. However, for more regular and sufficiently convex functions (e.g. $C^2$ and locally strongly convex), we will have a sphere of support (locally) to both graphs at corresponding points, and the order-reversing property of $\mathcal{L}$ provides a simple inequality relating the radii of these spheres. We state this without proof.

Proposition A.5. Let $u : \mathbb{R}^n \to \mathbb{R}$ be strongly convex and $C^2$ near $x_0$, and suppose $u$ has a sphere of support of radius $r_{x_0}$. If $r_{y_0}$ is the radius of a sphere of support to $u^*$ at $y_0 = \nabla u(x_0)$, then

(1 + \x\ 2 Y (l + |Vm(x 0 )| 2 )" 